10 research outputs found

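    Digitaalse teadmuse arhiveerimine – teoreetilis-praktiline uurimistöö Rahvusarhiivi näitel [Archiving digital knowledge: a theoretical and practical study based on the National Archives of Estonia]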

    The electronic version of the dissertation does not contain the publications. The ever-accelerating growth of digital information has helped highlight the need to preserve important information. Preservation here means not merely physical backup but also ensuring that the information remains usable and understandable. In practice, this means we must also make sure that the hardware and software needed to use the archived information are available. If they are not, emulators can be used in some cases: they mimic a specific obsolete system and thus allow old files to be opened. However, if technological obsolescence can be foreseen, it would be sensible to convert the files early into a more durable format or to replace the storage medium with a more modern one. Emulation, conversion, and their combination all help preserve the usability of information, but they may not guarantee authentic understandability, because the representation of digital information always depends on how the preserved bits are interpreted. For example, if a document is created with WordPad and the same document is opened with Hex Editor Neo, we see the file in binary form; Notepad++ shows the RTF encoding; and in the renderings of Microsoft Word 2010 and LibreOffice Writer we can already notice several differences. All of these representations are technologically correct. No error messages appear when the file is opened, because from the software's point of view this is exactly how the representations should look. It is important to stress that even a correct representation may remain incomprehensible to the user: the fact that the data have survived and can be read and rendered does not, unfortunately, guarantee that they are understood correctly. To ensure understandability, the end-user community must always be taken into account. This work therefore investigates ways to support the digital archiving of knowledge (understandable information), relying above all on best practice, practical experiments at the National Archives of Estonia, and interdisciplinary techniques (e.g. combining information technology with archival science).
Digital preservation of knowledge is a very broad and complex research area. Many aspects are still open for research. According to the literature, the accessibility and usability of digital information have been investigated more than the comprehensibility of important digital information over time. Although there are remedies (e.g. emulation and migration) for mitigating the risks related to accessibility and usability, the question of how to guarantee the understandability/comprehensibility of archived information is still the subject of ongoing research. Understanding digital information first requires a representation of the archived information, so that a user can then interpret and understand it. However, it is a not-so-well-known fact that digital information does not have any fixed representation until some software is involved. For example, if we create a document in WordPad and open the same file in Hex Editor Neo, we will see the binary representation, which is also correct but not suitable for human users, as humans are not used to interpreting binary codes. When we open that file in Notepad++, we can see the structure of the RTF coding. Again, this is a correct interpretation of the file, but it is not understandable to the ordinary user, as it shows the technical view of the file format structure. When we open that file in Microsoft Word 2010 or LibreOffice Writer, we will notice some changes, although the original bits are the same and no errors are displayed by the software. Thus, all representations are technologically correct and no errors will be displayed to the user when opening this file.
It is important to emphasise that in some cases even the original representation may not be understandable to the users. Therefore, it is important to know who the main users of the archives are and to ensure that the archived objects are independently understandable to that community over the long term. This dissertation will therefore research meaningful use of digital objects by taking into account the designated users’ knowledge and the Open Archival Information System (OAIS) model. The research also includes several practical experimental projects at the National Archives of Estonia which will test some important parts of the theoretical work.
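
The point that one stored bit sequence yields several "correct" representations can be illustrated with a minimal Python sketch. The sample RTF string and the three function names are assumptions made for illustration only; they are not taken from the dissertation, and the last function is a toy stripper for this one sample rather than a real RTF renderer.

```python
import re

# Hypothetical minimal RTF document (illustrative assumption, not a file from the dissertation).
RTF_BYTES = b"{\\rtf1\\ansi Hello, \\b archive\\b0 !}"


def as_hex(data: bytes) -> str:
    """Hex-editor style view: every byte shown as a two-digit hexadecimal pair."""
    return data.hex(" ")


def as_markup(data: bytes) -> str:
    """Text-editor style view: bytes decoded as ASCII, RTF control words still visible."""
    return data.decode("ascii")


def as_plain_text(data: bytes) -> str:
    """Rough word-processor style view: control words and braces removed.

    This only handles the simple sample above; it is not an RTF parser.
    """
    text = data.decode("ascii")
    text = re.sub(r"\\[a-z]+-?\d* ?", "", text)  # drop control words such as \rtf1 or \b0
    return text.replace("{", "").replace("}", "")


if __name__ == "__main__":
    # Three views of the same, unchanged bytes.
    print(as_hex(RTF_BYTES))
    print(as_markup(RTF_BYTES))
    print(as_plain_text(RTF_BYTES))
```

All three functions read the same unchanged bytes; only the interpretation differs, which is why none of the views can be called an error.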

    Arhiivi infosüsteem 2.0 [Archival Information System 2.0]


    Archival Information Package (AIP) Pilot Specification

    This report presents the E-ARK AIP format specification as it will be used by the pilots (implementations in pilot organizations). The deliverable is a follow-up version of E-ARK deliverable D4.2. The report describes the structure, metadata, and physical container format of the E-ARK AIP, a container which is the result of converting an E-ARK Submission Information Package (SIP) into the E-ARK Archival Information Package (AIP). The conversion will be implemented in the Integrated Platform as part of the component earkweb

    SIP Draft Specification

    A Submission Information Package (SIP) is defined in the OAIS standard as an Information Package that is delivered by the Producer to the OAIS for use in the construction or update of one or more AIPs and/or the associated Descriptive Information. Many different SIP formats are used all over the world and unfortunately there is currently no central format for a SIP which would cover all individual national and business needs identified in the E-ARK Report on Available Best Practices. Therefore, the main objective of this report is to describe a draft SIP specification for the E-ARK project: to give an overview of the structure and main metadata elements of the E-ARK SIP and to provide initial input for the technical implementations of E-ARK ingest tools. The target group of this work is E-ARK project partners as well as all other archival institutions and software providers creating or updating their SIP format specifications. This report provides an overview of: • The general structure for submission information packages. This report explains how the E-ARK SIP is constructed by following the common rules for all other (archival, dissemination) information packages. • The SIP METS Profile. We provide a detailed overview of metadata sections and the metadata elements in these sections. The table with all metadata elements could be of interest to technical stakeholders who wish to continue with the more detailed work of the E-ARK SIP implementation later. Two examples with different kinds of content (MoReq2010, SIARD-E) following the common structure for the E-ARK submission information package can be found in the appendixes to this report.

    Records export, transfer and ingest recommendations and SIP Creation Tools

    This report describes a software deliverable as it delivers a number of E-ARK tools: • ERMS Export Module (a tool for exporting records and their metadata from ERMS in a controlled manner); • Database Preservation Toolkit (a tool for exporting relational databases as SIARD 2.0 or other formats); • ESSArch Tools for Producer (a tool for SIP creation); • ESSArch Tools for Archive (a tool for SIP ingestion); • RODA-in (a tool for SIP creation); • Universal Archiving Module (a tool for SIP creation). In addition, this report provides an overview of the Pre-Ingest and Ingest processes, which will help readers understand the tools and their use.

    SMURF (Semantically Marked Up Record Format) Profile

    The purpose of this report is to describe the SMURF (semantically marked up record format) profile, which includes ERMS (electronic records management systems) and SFSB (simple file-system based) records as described below. When extracting information from a producer’s system one has the choice of two generic options: 1. Extracting data in a relational database structure: Extracting data from a relational database into a long-term preservation format (SIARD) that preserves the properties of the relational database so that the data can be imported into a relational database management system (RDBMS) on Access. Access can happen via database queries or via a search field. The main access use cases are: a. The producer wishes to retrieve their data for business purposes and/or re-use. b. The consumer wishes to consult the data for purposes of research. c. The archivist wishes to retrieve the data for professional treatment: to check and, if necessary, perform preservation actions, etc. More information about this option can be found in the SIARD 2.0 Profile Specification. 2. Extracting data and metadata as records: Extract the records and normalise them to a standard E-ARK XML format. This means that the records are semantically marked up using metadata. Being technically valid and complying with this specification makes them directly accessible for validation, data management, indexing and searching. Their structured semantic metadata description is explicit rather than hidden inside an RDBMS. The representation of descriptive metadata inside the archive can be in the E-ARK SMURF AIP format and/or another native archive format. The main advantages over the RDBMS representation are that: • Records from different sources can be merged. • Search and access are possible across all records from all sources. • Records can be managed and accessed uniformly. • The original database / records system software does not need to be licensed and preserved.

    Detailed Pilots Specification

    The Electronic Archiving Service consists of a series of activities covered by software tools and manual workflow steps. Some of these tools already exist, some are being developed by the E-ARK project, and many more are to be added by developments of the digital preservation community in the future. The role of this report is to identify the most relevant scenarios for the E-ARK Service and to define, for each scenario, which level of activity is needed in order to bridge the gap in the currently existing solutions (e.g. integration, software development, interface definition

    Best Practice: SIP specification, records export requirements, transfer and ingest

    This report provides an overview of the current state of digital archiving best practices. Special attention is given to archival ingest workflows and to the submission information package formats used for transfer and ingest of digital objects and their metadata. Records export best practices are covered as well. The report consists of the following parts: • introduction; • description of the methods used for the analysis; • overview of the results with short descriptions of practices, standards and tools; • recommendations for the E-ARK project; • appendices (the survey questions, an assessment of the interviewed stakeholders, the questions from the qualitative interview and a terminology list). The study concentrates on the following topics from the archival workflow: • Records export (Pre-Ingest workflow steps); • Steps in Ingest workflow; • Submission information packages (SIP) used. Highlighted points of this best practice report for E-ARK work are: • One high-level (pre-)ingest workflow is proposed in section 4 which consists of 4 phases of the PAIMAS methodology, but several existing workflow parts must be examined more deeply to include the common steps in the E-ARK archiving workflow; • E-ARK needs to develop detailed and commonly understood requirements for the records export process which include procedures for data selection, extraction, metadata mapping, validation and quality control, as these are currently lacking; • One high-level SIP structure is proposed in section 4 (Recommendation for further work), but several existing SIP physical and logical structures must be examined more deeply to include the common aspects of formats used at archives in the E-ARK SIP specification.

    Recommended Practices and Final Public Report on Pilots

    This report summarizes pilot activities, achievements and best practice recommendations using the following chapter structure: Chapter 1 - This introductory chapter. Chapter 2 - Planning and executing the E-ARK pilots. Summary of all pilot-related activities in the 3 years of the pilot, from planning to evaluation. Chapter 3 - Pilot overview. A brief overview of the full-scale and additional pilots. Chapter 4 - Pilot report. Summary of the pilot execution and results with recommended practices and further development recommendations. The chapter consists of the following sections for each full-scale pilot: pilot scenario details; execution report; changes to previous plans; feedback report; and recommended practices and lessons learnt. Chapter 4 ends with an overview of the external evaluations performed by non-E-ARK member organizations. Chapter 5 - Pilot evaluation. Evaluation of the full-scale pilot against project objectives and success criteria. Chapter 6 - Referenced documents and web pages. Appendix 1 – Extract from E-ARK Description of Wor